Efficient algorithms for biological stems search
Identifieur interne : 002006 ( Main/Exploration ); précédent : 002005; suivant : 002007Efficient algorithms for biological stems search
Auteurs : Tian Mi [États-Unis] ; Sanguthevar Rajasekaran [États-Unis]Source :
- BMC Bioinformatics [ 1471-2105 ] ; 2013.
Descripteurs français
- KwdFr :
- MESH :
English descriptors
- KwdEn :
- MESH :
- chemical , chemistry : DNA, RNA.
- Algorithms, Amino Acid Motifs, Nucleotide Motifs, Sequence Analysis, DNA, Sequence Analysis, Protein.
Abstract
Motifs are significant patterns in DNA, RNA, and protein sequences, which play an important role in biological processes and functions, like identification of open reading frames, RNA transcription, protein binding, etc. Several versions of the motif search problem have been studied in the literature. One such version is called the Planted Motif Search (PMS)or (
In this paper we propose an efficient algorithm for MSS and evaluate it on both synthetic and real data. This evaluation reveals that our algorithm is much faster than Kuksa and Pavlovic’s algorithm.
Our MSS algorithm outperforms the algorithm of Kuksa and Pavlovic in terms of the run time as well as the number of stems output. Specifically, the stems output by our algorithm form a proper (and much smaller)subset of the stems output by Kuksa and Pavlovic’s algorithm.
Url:
DOI: 10.1186/1471-2105-14-161
PubMed: 23679045
PubMed Central: 3679804
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream Pmc, to step Corpus: 000946
- to stream Pmc, to step Curation: 000946
- to stream Pmc, to step Checkpoint: 001233
- to stream PubMed, to step Corpus: 001C92
- to stream PubMed, to step Curation: 001C92
- to stream PubMed, to step Checkpoint: 001B58
- to stream Ncbi, to step Merge: 000A49
- to stream Ncbi, to step Curation: 000A49
- to stream Ncbi, to step Checkpoint: 000A49
- to stream Main, to step Merge: 002027
- to stream Main, to step Curation: 002006
Le document en format XML
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Efficient algorithms for biological stems search</title>
<author><name sortKey="Mi, Tian" sort="Mi, Tian" uniqKey="Mi T" first="Tian" last="Mi">Tian Mi</name>
<affiliation wicri:level="2"><nlm:aff id="I1">Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science and Engineering, University of Connecticut, Storrs, CT</wicri:regionArea>
<placeName><region type="state">Connecticut</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Rajasekaran, Sanguthevar" sort="Rajasekaran, Sanguthevar" uniqKey="Rajasekaran S" first="Sanguthevar" last="Rajasekaran">Sanguthevar Rajasekaran</name>
<affiliation wicri:level="2"><nlm:aff id="I1">Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science and Engineering, University of Connecticut, Storrs, CT</wicri:regionArea>
<placeName><region type="state">Connecticut</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">23679045</idno>
<idno type="pmc">3679804</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3679804</idno>
<idno type="RBID">PMC:3679804</idno>
<idno type="doi">10.1186/1471-2105-14-161</idno>
<date when="2013">2013</date>
<idno type="wicri:Area/Pmc/Corpus">000946</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000946</idno>
<idno type="wicri:Area/Pmc/Curation">000946</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000946</idno>
<idno type="wicri:Area/Pmc/Checkpoint">001233</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">001233</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="RBID">pubmed:23679045</idno>
<idno type="wicri:Area/PubMed/Corpus">001C92</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">001C92</idno>
<idno type="wicri:Area/PubMed/Curation">001C92</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">001C92</idno>
<idno type="wicri:Area/PubMed/Checkpoint">001B58</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">001B58</idno>
<idno type="wicri:Area/Ncbi/Merge">000A49</idno>
<idno type="wicri:Area/Ncbi/Curation">000A49</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000A49</idno>
<idno type="wicri:Area/Main/Merge">002027</idno>
<idno type="wicri:Area/Main/Curation">002006</idno>
<idno type="wicri:Area/Main/Exploration">002006</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">Efficient algorithms for biological stems search</title>
<author><name sortKey="Mi, Tian" sort="Mi, Tian" uniqKey="Mi T" first="Tian" last="Mi">Tian Mi</name>
<affiliation wicri:level="2"><nlm:aff id="I1">Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science and Engineering, University of Connecticut, Storrs, CT</wicri:regionArea>
<placeName><region type="state">Connecticut</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Rajasekaran, Sanguthevar" sort="Rajasekaran, Sanguthevar" uniqKey="Rajasekaran S" first="Sanguthevar" last="Rajasekaran">Sanguthevar Rajasekaran</name>
<affiliation wicri:level="2"><nlm:aff id="I1">Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science and Engineering, University of Connecticut, Storrs, CT</wicri:regionArea>
<placeName><region type="state">Connecticut</region>
</placeName>
</affiliation>
</author>
</analytic>
<series><title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint><date when="2013">2013</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Algorithms</term>
<term>Amino Acid Motifs</term>
<term>DNA (chemistry)</term>
<term>Nucleotide Motifs</term>
<term>RNA (chemistry)</term>
<term>Sequence Analysis, DNA</term>
<term>Sequence Analysis, Protein</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr"><term>ADN ()</term>
<term>ARN ()</term>
<term>Algorithmes</term>
<term>Analyse de séquence d'ADN</term>
<term>Analyse de séquence de protéine</term>
<term>Motifs d'acides aminés</term>
<term>Motifs nucléotidiques</term>
</keywords>
<keywords scheme="MESH" type="chemical" qualifier="chemistry" xml:lang="en"><term>DNA</term>
<term>RNA</term>
</keywords>
<keywords scheme="MESH" xml:lang="en"><term>Algorithms</term>
<term>Amino Acid Motifs</term>
<term>Nucleotide Motifs</term>
<term>Sequence Analysis, DNA</term>
<term>Sequence Analysis, Protein</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr"><term>ADN</term>
<term>ARN</term>
<term>Algorithmes</term>
<term>Analyse de séquence d'ADN</term>
<term>Analyse de séquence de protéine</term>
<term>Motifs d'acides aminés</term>
<term>Motifs nucléotidiques</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><sec><title>Background</title>
<p>Motifs are significant patterns in DNA, RNA, and protein sequences, which play an important role in biological processes and functions, like identification of open reading frames, RNA transcription, protein binding, etc. Several versions of the motif search problem have been studied in the literature. One such version is called the Planted Motif Search (PMS)or (<italic>l, d</italic>
)-motif Search. PMS is known to be NP complete. The time complexities of most of the planted motif search algorithms depend exponentially on the alphabet size. Recently a new version of the motif search problem has been introduced by Kuksa and Pavlovic. We call this version as the <italic>Motif Stems Search (MSS)</italic>
problem. A motif stem is an <italic>l</italic>
-mer (for some relevant value of <italic>l</italic>
)with some wildcard characters and hence corresponds to a set of <italic>l</italic>
-mers (without wildcards), some of which are (<italic>l, d</italic>
)-motifs. Kuksa and Pavlovic have presented an efficient algorithm to find motif stems for inputs from large alphabets. Ideally, the number of stems output should be as small as possible since the stems form a superset of the motifs.</p>
</sec>
<sec><title>Results</title>
<p>In this paper we propose an efficient algorithm for MSS and evaluate it on both synthetic and real data. This evaluation reveals that our algorithm is much faster than Kuksa and Pavlovic’s algorithm.</p>
</sec>
<sec><title>Conclusions</title>
<p>Our MSS algorithm outperforms the algorithm of Kuksa and Pavlovic in terms of the run time as well as the number of stems output. Specifically, the stems output by our algorithm form a proper (and much smaller)subset of the stems output by Kuksa and Pavlovic’s algorithm.</p>
</sec>
</div>
</front>
<back><div1 type="bibliography"><listBibl><biblStruct><analytic><author><name sortKey="Mi, T" uniqKey="Mi T">T Mi</name>
</author>
<author><name sortKey="Merlin, Jc" uniqKey="Merlin J">JC Merlin</name>
</author>
<author><name sortKey="Deverasetty, S" uniqKey="Deverasetty S">S Deverasetty</name>
</author>
<author><name sortKey="Gryk, Mr" uniqKey="Gryk M">MR Gryk</name>
</author>
<author><name sortKey="Bill, Tj" uniqKey="Bill T">TJ Bill</name>
</author>
<author><name sortKey="Brooks, Aw" uniqKey="Brooks A">AW Brooks</name>
</author>
<author><name sortKey="Lee, Ly" uniqKey="Lee L">LY Lee</name>
</author>
<author><name sortKey="Rathnayake, V" uniqKey="Rathnayake V">V Rathnayake</name>
</author>
<author><name sortKey="Ross, Ca" uniqKey="Ross C">CA Ross</name>
</author>
<author><name sortKey="Sargeant, Dp" uniqKey="Sargeant D">DP Sargeant</name>
</author>
<author><name sortKey="Strong, Cl" uniqKey="Strong C">CL Strong</name>
</author>
<author><name sortKey="Watts, P" uniqKey="Watts P">P Watts</name>
</author>
<author><name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
<author><name sortKey="Schiller, Mr" uniqKey="Schiller M">MR Schiller</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Gould, Cm" uniqKey="Gould C">CM Gould</name>
</author>
<author><name sortKey="Diella, F" uniqKey="Diella F">F Diella</name>
</author>
<author><name sortKey="Via, A" uniqKey="Via A">A Via</name>
</author>
<author><name sortKey="Puntervoll, P" uniqKey="Puntervoll P">P Puntervoll</name>
</author>
<author><name sortKey="Gemund, C" uniqKey="Gemund C">C Gemund</name>
</author>
<author><name sortKey="Chabanis Davidson, S" uniqKey="Chabanis Davidson S">S Chabanis-Davidson</name>
</author>
<author><name sortKey="Michael, S" uniqKey="Michael S">S Michael</name>
</author>
<author><name sortKey="Sayadi, A" uniqKey="Sayadi A">A Sayadi</name>
</author>
<author><name sortKey="Bryne, Jc" uniqKey="Bryne J">JC Bryne</name>
</author>
<author><name sortKey="Chica, C" uniqKey="Chica C">C Chica</name>
</author>
<author><name sortKey="Seiler, M" uniqKey="Seiler M">M Seiler</name>
</author>
<author><name sortKey="Davey, Ne" uniqKey="Davey N">NE Davey</name>
</author>
<author><name sortKey="Haslam, Nj" uniqKey="Haslam N">NJ Haslam</name>
</author>
<author><name sortKey="Weatheritt, Rj" uniqKey="Weatheritt R">RJ Weatheritt</name>
</author>
<author><name sortKey="Budd, A" uniqKey="Budd A">A Budd</name>
</author>
<author><name sortKey="Hughes, T" uniqKey="Hughes T">T Hughes</name>
</author>
<author><name sortKey="Pas, J" uniqKey="Pas J">J Pas</name>
</author>
<author><name sortKey="Rychlewski, L" uniqKey="Rychlewski L">L Rychlewski</name>
</author>
<author><name sortKey="Trave, G" uniqKey="Trave G">G Trave</name>
</author>
<author><name sortKey="Aasland, R" uniqKey="Aasland R">R Aasland</name>
</author>
<author><name sortKey="Helmer Citterich, M" uniqKey="Helmer Citterich M">M Helmer-Citterich</name>
</author>
<author><name sortKey="Linding, R" uniqKey="Linding R">R Linding</name>
</author>
<author><name sortKey="Gibson, Tj" uniqKey="Gibson T">TJ Gibson</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Obenauer, Jc" uniqKey="Obenauer J">JC Obenauer</name>
</author>
<author><name sortKey="Cantley, Lc" uniqKey="Cantley L">LC Cantley</name>
</author>
<author><name sortKey="Yaffe, Mb" uniqKey="Yaffe M">MB Yaffe</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Pevzner, Pa" uniqKey="Pevzner P">PA Pevzner</name>
</author>
<author><name sortKey="Hoi Sze, S" uniqKey="Hoi Sze S">S hoi Sze</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Price, A" uniqKey="Price A">A Price</name>
</author>
<author><name sortKey="Ramabhadran, S" uniqKey="Ramabhadran S">S Ramabhadran</name>
</author>
<author><name sortKey="Pevzner, Pa" uniqKey="Pevzner P">PA Pevzner</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Buhler, J" uniqKey="Buhler J">J Buhler</name>
</author>
<author><name sortKey="Tompa, M" uniqKey="Tompa M">M Tompa</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Bailey, Tl" uniqKey="Bailey T">TL Bailey</name>
</author>
<author><name sortKey="Elkan, C" uniqKey="Elkan C">C Elkan</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Lawrence, Ce" uniqKey="Lawrence C">CE Lawrence</name>
</author>
<author><name sortKey="Altschul, Sf" uniqKey="Altschul S">SF Altschul</name>
</author>
<author><name sortKey="Boguski, Ms" uniqKey="Boguski M">MS Boguski</name>
</author>
<author><name sortKey="Liu, Js" uniqKey="Liu J">JS Liu</name>
</author>
<author><name sortKey="Neuwald, Af" uniqKey="Neuwald A">AF Neuwald</name>
</author>
<author><name sortKey="Wootton, Jc" uniqKey="Wootton J">JC Wootton</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Rocke, E" uniqKey="Rocke E">E Rocke</name>
</author>
<author><name sortKey="Tompa, M" uniqKey="Tompa M">M Tompa</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Keich, U" uniqKey="Keich U">U Keich</name>
</author>
<author><name sortKey="Pevzner, P" uniqKey="Pevzner P">P Pevzner</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Timothy, Lb" uniqKey="Timothy L">LB Timothy</name>
</author>
<author><name sortKey="Elkan, C" uniqKey="Elkan C">C Elkan</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Hertz, Gz" uniqKey="Hertz G">GZ Hertz</name>
</author>
<author><name sortKey="Stormo, Gd" uniqKey="Stormo G">GD Stormo</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Eskin, E" uniqKey="Eskin E">E Eskin</name>
</author>
<author><name sortKey="Pevzner, Pa" uniqKey="Pevzner P">PA Pevzner</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Pisanti, N" uniqKey="Pisanti N">N Pisanti</name>
</author>
<author><name sortKey="Carvalho, A" uniqKey="Carvalho A">A Carvalho</name>
</author>
<author><name sortKey="Marsan, L" uniqKey="Marsan L">L Marsan</name>
</author>
<author><name sortKey="Sagot, Mf" uniqKey="Sagot M">MF Sagot</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
<author><name sortKey="Balla, S" uniqKey="Balla S">S Balla</name>
</author>
<author><name sortKey="Hsi Huang, C" uniqKey="Hsi Huang C">C hsi Huang</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Chin, Fyl" uniqKey="Chin F">FYL Chin</name>
</author>
<author><name sortKey="Leung, Hcm" uniqKey="Leung H">HCM Leung</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Davila, J" uniqKey="Davila J">J Davila</name>
</author>
<author><name sortKey="Balla, S" uniqKey="Balla S">S Balla</name>
</author>
<author><name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
<author><name sortKey="Dinh, H" uniqKey="Dinh H">H Dinh</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Dinh, H" uniqKey="Dinh H">H Dinh</name>
</author>
<author><name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
<author><name sortKey="Kundeti, V" uniqKey="Kundeti V">V Kundeti</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Bandyopadhyay, S" uniqKey="Bandyopadhyay S">S Bandyopadhyay</name>
</author>
<author><name sortKey="Sahni, S" uniqKey="Sahni S">S Sahni</name>
</author>
<author><name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Kuksa, P" uniqKey="Kuksa P">P Kuksa</name>
</author>
<author><name sortKey="Pavlovic, V" uniqKey="Pavlovic V">V Pavlovic</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Davila, J" uniqKey="Davila J">J Davila</name>
</author>
<author><name sortKey="Balla, S" uniqKey="Balla S">S Balla</name>
</author>
<author><name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Connecticut</li>
</region>
</list>
<tree><country name="États-Unis"><region name="Connecticut"><name sortKey="Mi, Tian" sort="Mi, Tian" uniqKey="Mi T" first="Tian" last="Mi">Tian Mi</name>
</region>
<name sortKey="Rajasekaran, Sanguthevar" sort="Rajasekaran, Sanguthevar" uniqKey="Rajasekaran S" first="Sanguthevar" last="Rajasekaran">Sanguthevar Rajasekaran</name>
</country>
</tree>
</affiliations>
</record>
Pour manipuler ce document sous Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002006 | SxmlIndent | more
Ou
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002006 | SxmlIndent | more
Pour mettre un lien sur cette page dans le réseau Wicri
{{Explor lien |wiki= Sante |area= MersV1 |flux= Main |étape= Exploration |type= RBID |clé= PMC:3679804 |texte= Efficient algorithms for biological stems search }}
Pour générer des pages wiki
HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i -Sk "pubmed:23679045" \ | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd \ | NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33. |